Grammar Development for Czech Syntactic Parser with Corpus-based Techniques
Abstract
In this paper, we describe the Czech syntactic parser synt, developed at the FI MU NLP laboratory. The system is based on a meta-grammar formalism with a head-driven chart parser. The parsing technique provides fast analysis of the context-free backbone, followed by evaluation of the contextual constraints using a so-called “forest of values.” The meta-grammar formalism makes it possible to capture complicated grammatical relations with a maintainable number of rules. Besides describing the synt system, we present the process of meta-grammar development. One of its first phases is the construction of corpus data for testing. We demonstrate the use of the corpus in testing a method for selecting the “best analysis,” and we report the results of testing the synt analysis on a Czech corpus.
Similar resources
Studying impressive parameters on the performance of Persian probabilistic context free grammar parser
In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, the Penn Treebank, was published. Although tree banks originated in computational linguistics, their value is becoming more widely appreciated in linguistics research as a whole. F...
Feature Engineering in Persian Dependency Parser
A dependency parser is one of the most important fundamental tools in natural language processing; it extracts the structure of sentences and determines the relations between words based on dependency grammar. Dependency parsing is well suited to free-word-order languages such as Persian. In this paper, a data-driven dependency parser has been developed with the help of phrase-structure parser fo...
Semantic Role Labeling of Persian Sentences with a Memory-Based Learning Approach
Extracting semantic roles is one of the major steps in representing text meaning. It refers to finding the semantic relations between a predicate and the syntactic constituents in a sentence. In this paper, we present a semantic role labeling system for Persian that uses a memory-based learning model and standard features. Our proposed system implements a two-phase architecture to first identify...
Semi-automatic Syntactic and Semantic Corpus Annotation with a Deep Parser
We describe a semi-automatic method for linguistically rich corpus annotation using a broad-coverage deep parser to generate syntactic structure, semantic representation and discourse information for task-oriented dialogs. The parser-generated analyses are checked by trained annotators. Incomplete coverage and incorrect analyses are addressed through lexicon and grammar development, after which...
Platform for Full-Syntax Grammar Development Using Meta-grammar Constructs
This paper describes a combination of tools necessary for full or deep syntactic parsing of natural language: the syntactic parser synt, the graphical Grammar Development Workbench (GDW), and the VerbaLex verb valency lexicon tools. We describe the development of these tools and how they integrate into one system that allows a team of experts (computational linguists as well as programme...